177 research outputs found

    An Approximate Subgame-Perfect Equilibrium Computation Technique for Repeated Games

    This paper presents a technique for approximating, to any desired precision, the set of subgame-perfect equilibria (SPE) in discounted repeated games. The process starts with a single hypercube that approximates the set of SPE. This initial hypercube is then gradually partitioned into a set of smaller adjacent hypercubes, while those hypercubes that cannot contain any point of the SPE set are discarded. Whether a given hypercube can contain an equilibrium point is verified by an appropriate mathematical program. Three formulations of the algorithm are then proposed, both for approximately computing the set of SPE payoffs and for extracting players' strategies: the first two do not assume any external coordination between players, and the third assumes a certain level of coordination during game play in order to convexify the set of continuation payoffs after any repeated-game history. Special attention is paid to the extraction of players' strategies and their representability as finite automata, an important feature for artificial agent systems.
    Comment: 26 pages, 13 figures, 1 table
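The partition-and-discard loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `split`, `refine` and `touches_unit_disc` are hypothetical, and the toy oracle (does the box touch the unit disc?) stands in for the mathematical program that the paper uses to test whether a hypercube can contain an equilibrium point.

```python
from itertools import product
from typing import Callable, List, Tuple

# A box is one (low, high) interval per axis (one axis per player's payoff).
Box = Tuple[Tuple[float, float], ...]


def split(box: Box) -> List[Box]:
    """Cut a box in half along every axis, giving 2**n sub-boxes."""
    halves = [((lo, (lo + hi) / 2), ((lo + hi) / 2, hi)) for lo, hi in box]
    return [tuple(choice) for choice in product(*halves)]


def refine(box: Box, may_contain: Callable[[Box], bool], rounds: int) -> List[Box]:
    """Repeatedly split the surviving boxes and drop those the oracle rejects."""
    boxes = [box]
    for _ in range(rounds):
        boxes = [sub for b in boxes for sub in split(b) if may_contain(sub)]
    return boxes


def touches_unit_disc(box: Box) -> bool:
    """Toy oracle (an assumption, not the paper's program): True if the
    point of the box closest to the origin lies within distance 1."""
    nearest_sq = sum(max(lo, 0.0, -hi) ** 2 for lo, hi in box)
    return nearest_sq <= 1.0


if __name__ == "__main__":
    start = ((-2.0, 2.0), (-2.0, 2.0))
    for rounds in range(1, 5):
        print(rounds, len(refine(start, touches_unit_disc, rounds)))
```

Each round halves the side length of the surviving boxes, so the union of kept boxes shrinks toward the target set as the number of rounds grows.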

    Hypergame Analysis in E-Commerce: A Preliminary Report

    In classical game theory, it is normally assumed that "all the players see the same game", i.e., that they are aware of each other's strategies and preferences. This assumption is very strong for real-life settings, where differences in perception that affect decision making seem to be the rule rather than the exception. Attempts have been made to incorporate misperceptions of various types, but most of these attempts rely on quantities (such as probabilities, risk factors, etc.) that are generally too subjective. One attractive approach is to consider that the players are trying to play "different games" within a hypergame. In this paper, we present a hypergame approach as an analysis tool for multiagent environments. We first sketch a brief formal introduction to hypergames. We then explain how agents can interact through communication or through a mediator when they have different views of, and in particular misperceptions about, others' games. After that, we show how agents can take advantage of misperceptions. Finally, we conclude and present some future work.
    Keywords: Game Theory, Hypergame, Mediation

    Dynamic Y-KD: A Hybrid Approach to Continual Instance Segmentation

    Despite the success of deep learning models on instance segmentation, current methods still suffer from catastrophic forgetting in continual learning scenarios. In this paper, our contributions to continual instance segmentation are threefold. First, we propose Y-knowledge distillation (Y-KD), a technique that shares a common feature extractor between the teacher and student networks. As the teacher is also updated with new data in Y-KD, the increased plasticity results in new modules that are specialized on new classes. Second, our Y-KD approach is supported by a dynamic architecture method that trains task-specific modules with a unique instance segmentation head, thereby significantly reducing forgetting. Third, we complete our approach by leveraging checkpoint averaging as a simple method to manually balance the trade-off between performance on the various sets of classes, thus increasing control over the model's behavior without any additional cost. These contributions are united in a model that we name the Dynamic Y-KD network. We perform extensive experiments on several single-step and multi-step incremental learning scenarios, and we show that our approach outperforms previous methods on both past and new classes. For instance, compared to recent work, our method obtains +2.1% mAP on old classes in 15-1, +7.6% mAP on new classes in 19-1, and reaches 91.5% of the mAP obtained by joint training on all classes in 15-5.
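As a rough illustration of the checkpoint-averaging idea mentioned in this abstract, the sketch below interpolates two sets of parameters with a manually chosen weight. It is not the authors' implementation: the function name `average_checkpoints`, the weight `alpha`, and the representation of a checkpoint as a dictionary of float lists are all assumptions made for the example.

```python
from typing import Dict, List

# A checkpoint is modelled here as parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def average_checkpoints(old: Params, new: Params, alpha: float = 0.5) -> Params:
    """Return alpha * old + (1 - alpha) * new, parameter by parameter.

    A larger alpha keeps the result closer to the old checkpoint (favouring
    previously learned classes); a smaller alpha favours the new one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if old.keys() != new.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged: Params = {}
    for name in old:
        if len(old[name]) != len(new[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            alpha * a + (1.0 - alpha) * b for a, b in zip(old[name], new[name])
        ]
    return merged


if __name__ == "__main__":
    before = {"w": [0.0, 2.0]}
    after = {"w": [4.0, 2.0]}
    print(average_checkpoints(before, after, alpha=0.25))
```

Because the averaging happens once on stored weights, sweeping `alpha` requires no retraining, which is consistent with the abstract's remark that the trade-off can be tuned without additional cost.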

    Multi-item Auctions for Automatic Negotiation

    Available resources are often limited relative to the number of demands. In this paper, we propose an approach to this problem that uses multi-item auction mechanisms to allocate resources to a set of software agents. We treat the resource problem as a market in which vendor agents and buyer agents trade items representing the resources. These agents use multi-item auctions, which are viewed here as a process of automatic negotiation and implemented as a network of intelligent software agents. In this negotiation, agents exhibit different acquisition capabilities, which let them act differently depending on the current context or situation of the market. For example, the "richer" an agent is, the more items it can buy, i.e., the more resources it can acquire. We present a model for this approach based on the English auction, and then discuss experimental evidence for this model.
    Keywords: Multi-agent systems, Negotiations, Multi-item auctions
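The abstract's point that a "richer" agent can acquire more items can be illustrated with a small sketch of sequential ascending (English) auctions under budget constraints. This is a simplified example, not the paper's model: the functions `english_auction` and `allocate`, the fixed bid increment `step`, and the alphabetical tie-break are assumptions.

```python
from typing import Dict, List, Optional, Tuple


def english_auction(
    valuations: Dict[str, float],
    budgets: Dict[str, float],
    start: float = 0.0,
    step: float = 1.0,
) -> Tuple[Optional[str], Optional[float]]:
    """Raise the price by `step` until at most one bidder remains.

    A bidder stays in while the price exceeds neither its valuation nor
    its remaining budget. Returns (winner, price) or (None, None).
    """
    if step <= 0:
        raise ValueError("step must be positive")
    limits = {b: min(valuations[b], budgets[b]) for b in valuations}
    active = sorted(b for b, limit in limits.items() if limit >= start)
    if not active:
        return None, None
    price = start
    while len(active) > 1:
        staying = [b for b in active if limits[b] >= price + step]
        if not staying:
            break  # everyone drops at once: first name wins at current price
        price += step
        active = staying
    return active[0], price


def allocate(
    items: List[str],
    valuations: Dict[str, Dict[str, float]],
    budgets: Dict[str, float],
    step: float = 1.0,
) -> Tuple[Dict[str, Tuple[str, float]], Dict[str, float]]:
    """Auction the items one after another, charging winners as we go."""
    remaining = dict(budgets)
    allocation: Dict[str, Tuple[str, float]] = {}
    for item in items:
        item_values = {b: valuations[b][item] for b in remaining}
        winner, price = english_auction(item_values, remaining, step=step)
        if winner is not None:
            remaining[winner] -= price
            allocation[item] = (winner, price)
    return allocation, remaining


if __name__ == "__main__":
    values = {"rich": {"r1": 10, "r2": 10}, "poor": {"r1": 8, "r2": 8}}
    print(allocate(["r1", "r2"], values, {"rich": 20, "poor": 5}))
```

In the demo, both agents value the resources similarly, but the smaller budget caps what the poorer agent can bid, so the richer agent wins both items.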

    Generative Adversarial Positive-Unlabelled Learning

    In this work, we consider the task of classifying binary positive-unlabeled (PU) data. Existing PU models based on discriminative learning seek an optimal reweighting strategy for the U data, so that a decent decision boundary can be found. However, given limited P data, conventional PU models tend to overfit when adapted to very flexible deep neural networks. In contrast, we are the first to propose a totally new paradigm for the binary PU task, from the perspective of generative learning, by leveraging powerful generative adversarial networks (GAN). Our generative positive-unlabeled (GenPU) framework incorporates an array of discriminators and generators that are endowed with different roles in simultaneously producing realistic positive and negative samples. We provide theoretical analysis to justify that, at equilibrium, GenPU is capable of recovering both the positive and negative data distributions. Moreover, we show that GenPU is generalizable and closely related to semi-supervised classification. Given rather limited P data, experiments on both synthetic and real-world datasets demonstrate the effectiveness of our proposed framework. With infinite realistic and diverse sample streams generated from GenPU, a very flexible classifier can then be trained using deep neural networks.
    Comment: 8 pages

    Multi-agent coordination based on tokens : reduction of the bullwhip effect in a forest supply chain

    In this paper, we focus on the supply chain as a multi-agent system and we propose a new coordination technique to reduce the fluctuations of orders placed by each company to its suppliers in such a supply chain. This problem of amplification of the demand variability is called the bullwhip effect. To reduce such a bullwhip effect, we propose a technique based on tokens to achieve a decentralized coordination. Precisely, classical orders manage the demand itself whereas tokens manage effects on company inventory due to variations of this demand. Finally, the proposed approach is validated by the Wood Supply Game, which is a supply chain model used to make players aware of the bullwhip effect. We experimentally verify that our coordination technique leads to less variable orders (i.e. the standard deviation of orders is reduced) while inventory levels are not excessively high but sufficient to avoid backorders.
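The abstract evaluates coordination by the standard deviation of orders. A minimal sketch of how such a measurement might be computed is given below; the function names and sample series are hypothetical, and the ratio of order variability to demand variability is a commonly used bullwhip indicator added here as an assumption rather than something stated in the abstract.

```python
from statistics import pstdev
from typing import Sequence


def order_variability(orders: Sequence[float]) -> float:
    """Population standard deviation of the orders placed by one company."""
    return pstdev(orders)


def bullwhip_ratio(orders: Sequence[float], demand: Sequence[float]) -> float:
    """Order variability divided by demand variability.

    A value above 1 means the company amplifies the fluctuations it
    receives, which is the signature of the bullwhip effect.
    """
    demand_sd = pstdev(demand)
    if demand_sd == 0:
        raise ValueError("demand is constant, so the ratio is undefined")
    return pstdev(orders) / demand_sd


if __name__ == "__main__":
    demand = [4, 6, 4, 6]   # made-up incoming demand per period
    orders = [2, 8, 2, 8]   # made-up orders placed to the supplier
    print(order_variability(orders), bullwhip_ratio(orders, demand))
```

Comparing this ratio across the companies in a chain, with and without a coordination mechanism, is one way to check whether order fluctuations grow toward the upstream end.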